AITopics | data infrastructure

Building a strong data infrastructure for AI agent success

MIT Technology Review | Mar-10-2026, 14:00:00 GMT

As companies race to adopt agentic AI to spur innovation and gain efficiency, building the right enterprise data infrastructure has become a critical component of success. In the race to adopt and show value from AI, enterprises are moving faster than ever to deploy agentic AI as copilots, assistants, and autonomous task-runners. In late 2025, nearly two-thirds of companies were experimenting with AI agents, while 88% were using AI in at least one business function, up from 78% in 2024, according to McKinsey's annual AI report. Yet, while early pilots often succeed, only one in 10 companies has actually scaled its AI agents. One major issue: AI agents are only as effective as the data foundation supporting them. Experts argue that most companies are seeing delays in implementing AI not because of shortcomings in the models, but because they lack data architectures that deliver business context in a form that humans and agents can reliably use.

agent, artificial intelligence, social media, (13 more...)

MIT Technology Review

Country: North America > United States > Massachusetts (0.05)

Industry: Information Technology (0.51)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)

A Systematic Review of NeurIPS Dataset Management Practices

Neural Information Processing Systems | Feb-11-2026, 08:25:57 GMT

Datasets serve as a fundamental bedrock for machine learning models.

artificial intelligence, machine learning, natural language, (12 more...)

Neural Information Processing Systems

Country:

North America > United States > Texas > Travis County > Austin (0.04)
North America > United States > Minnesota (0.04)
North America > United States > Massachusetts > Suffolk County > Boston (0.04)
(3 more...)

Genre: Research Report (1.00)

Industry:

Information Technology (1.00)
Health & Medicine > Therapeutic Area (1.00)
Health & Medicine > Pharmaceuticals & Biotechnology (0.93)
Law > Intellectual Property & Technology Law (0.68)

Technology:

Information Technology > Information Management (1.00)
Information Technology > Data Science (1.00)
Information Technology > Communications (1.00)
(2 more...)

A Systematic Review of NeurIPS Dataset Management Practices

Neural Information Processing Systems | Oct-9-2025, 23:37:08 GMT

Datasets serve as a fundamental bedrock for machine learning models.

dataset, dataset paper, license, (9 more...)

Neural Information Processing Systems

Country:

North America > United States > Texas > Travis County > Austin (0.04)
North America > United States > Minnesota (0.04)
North America > United States > Massachusetts > Suffolk County > Boston (0.04)
(3 more...)

Genre: Research Report (1.00)

Industry:

Information Technology (1.00)
Health & Medicine > Therapeutic Area (1.00)
Health & Medicine > Pharmaceuticals & Biotechnology (0.93)
Law > Intellectual Property & Technology Law (0.68)

Technology:

Information Technology > Information Management (1.00)
Information Technology > Data Science (1.00)
Information Technology > Communications (1.00)
(2 more...)

A Systematic Review of NeurIPS Dataset Management Practices

Wu, Yiwei, Ajmani, Leah, Longpre, Shayne, Li, Hanlin

arXiv.org Artificial Intelligence | Oct-31-2024

As new machine learning methods demand larger training datasets, researchers and developers face significant challenges in dataset management. Although ethics reviews, documentation, and checklists have been established, it remains uncertain whether consistent dataset management practices exist across the community. This lack of a comprehensive overview hinders our ability to diagnose and address fundamental tensions and ethical issues related to managing large datasets. We present a systematic review of datasets published at the NeurIPS Datasets and Benchmarks track, focusing on four key aspects: provenance, distribution, ethical disclosure, and licensing. Our findings reveal that dataset provenance is often unclear due to ambiguous filtering and curation processes. Additionally, a variety of sites are used for dataset hosting, but only a few offer structured metadata and version control. These inconsistencies underscore the urgent need for standardized data infrastructures for the publication and management of datasets.

artificial intelligence, machine learning, natural language, (12 more...)

arXiv.org Artificial Intelligence

2411.00266

Country:

North America > United States > Texas > Travis County > Austin (0.04)
North America > United States > New York > New York County > New York City (0.04)
North America > United States > Minnesota (0.04)
(4 more...)

Genre: Research Report > New Finding (0.48)

Industry:

Information Technology (1.00)
Health & Medicine > Therapeutic Area (1.00)
Health & Medicine > Pharmaceuticals & Biotechnology (0.93)
Law > Intellectual Property & Technology Law (0.68)

Technology:

Information Technology > Information Management (1.00)
Information Technology > Communications (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

From the evolution of public data ecosystems to the evolving horizons of the forward-looking intelligent public data ecosystem empowered by emerging technologies

Nikiforova, Anastasija, Lnenicka, Martin, Milić, Petar, Luterek, Mariusz, Bolívar, Manuel Pedro Rodríguez

arXiv.org Artificial Intelligence | May-22-2024

Public data ecosystems (PDEs) represent complex socio-technical systems crucial for optimizing data use in the public sector and outside it. Recognizing their multifaceted nature, previous research proposed a six-generation Evolutionary Model of Public Data Ecosystems (EMPDE). Designed as a result of a systematic literature review on the topic spanning three decades, this model, while theoretically robust, necessitates empirical validation to enhance its practical applicability. This study addresses this gap by validating the theoretical model through a real-life examination in five European countries: Latvia, Serbia, Czech Republic, Spain, and Poland. This empirical validation provides insights into PDE dynamics and variations of implementations across contexts, particularly focusing on the forward-looking 6th generation of PDEs, named "Intelligent Public Data Generation", which represents a paradigm shift driven by emerging technologies such as cloud computing, Artificial Intelligence, Natural Language Processing tools, Generative AI, and Large Language Models (LLM), with the potential to contribute to both automation and augmentation of business processes within these ecosystems. By transcending their traditional status as a mere component, evolving into both an actor and a stakeholder simultaneously, these technologies catalyze innovation and progress, enhancing PDE management strategies to align with societal, regulatory, and technical imperatives in the digital era.

large language model, machine learning, natural language, (17 more...)

arXiv.org Artificial Intelligence

2405.13606

Country:

Europe > Serbia (0.34)
Europe > Latvia (0.25)
North America > United States > Hawaii (0.04)
(9 more...)

Genre: Research Report > New Finding (0.68)

Industry:

Government > E-government (0.47)
Information Technology > Security & Privacy (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.87)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.37)

Social Intelligence Data Infrastructure: Structuring the Present and Navigating the Future

Li, Minzhi, Shi, Weiyan, Ziems, Caleb, Yang, Diyi

arXiv.org Artificial Intelligence | Feb-27-2024

As Natural Language Processing (NLP) systems become increasingly integrated into human social life, these technologies will need to increasingly rely on social intelligence. Although there are many valuable datasets that benchmark isolated dimensions of social intelligence, there does not yet exist any body of work to join these threads into a cohesive subfield in which researchers can quickly identify research gaps and future directions. Towards this goal, we build a Social AI Data Infrastructure, which consists of a comprehensive social AI taxonomy and a data library of 480 NLP datasets. Our infrastructure allows us to analyze existing dataset efforts, and also evaluate language models' performance in different social intelligence aspects. Our analyses demonstrate its utility in enabling a thorough understanding of current data landscape and providing a holistic perspective on potential directions for future dataset development. We show there is a need for multifaceted datasets, increased diversity in language …

Figure 1: Our Social Intelligence Data Infrastructure gives a comprehensive overview and synthesis of social intelligence in NLP, with a theoretically grounded taxonomy and an NLP data library. Researchers can use our infrastructure to build and organize tasks, evaluate language models and derive future insights.

dataset, intelligence, social intelligence, (13 more...)

arXiv.org Artificial Intelligence

2403.14659

Country:

North America > United States > California > Santa Clara County > Palo Alto (0.04)
Oceania > Australia > Victoria > Melbourne (0.04)
Asia > Singapore (0.04)
(4 more...)

Genre:

Overview (1.00)
Research Report > New Finding (0.46)

Industry:

Information Technology (0.93)
Education (0.67)
Health & Medicine > Therapeutic Area > Psychiatry/Psychology (0.46)
Health & Medicine > Therapeutic Area > Neurology (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.71)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)
Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (0.46)

The great acceleration: CIO perspectives on generative AI

MIT Technology Review | Jul-18-2023, 13:00:00 GMT

Although AI was recognized as strategically important before generative AI became prominent, our 2022 survey found CIOs' ambitions limited: while 94% of organizations were using AI in some way, only 14% were aiming to achieve "enterprise-wide" AI by 2025. By contrast, the power of generative AI tools to democratize AI (to spread it through every function of the enterprise, to support every employee, and to engage every customer) heralds an inflection point where AI can grow from a technology employed for particular use cases to one that truly defines the modern enterprise. As such, chief information officers and technical leaders will have to act decisively: embracing generative AI to seize its opportunities and avoid ceding competitive ground, while also making strategic decisions about data infrastructure, model ownership, workforce structure, and AI governance that will have long-term consequences for organizational success. This report explores the latest thinking of chief information officers at some of the world's largest and best-known companies, as well as experts from the public, private, and academic sectors. It presents their thoughts about AI against the backdrop of our global survey of 600 senior data and technology executives.

generative ai, machine learning, natural language, (12 more...)

MIT Technology Review

Genre: Questionnaire & Opinion Survey (0.57)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Generation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (1.00)

The Different Approaches To MLOps, ModelOps, DataOps & AIOps - AI Summary

#artificialintelligence | Mar-7-2023, 22:50:18 GMT

MLOps, ModelOps, DataOps and AIOps are rapidly growing in importance as organizations look to leverage the power of artificial intelligence, machine learning and big data. Each approach allows organizations to build reliable systems that can effectively process large amounts of data quickly and efficiently. MLOps focuses on a continuous delivery cycle for machine learning models through automated pipelines; ModelOps is used to manage model development from conception to deployment; DataOps provides tools for developing efficient data processing pipelines; and AIOps is an AI-driven operations platform that helps automate IT processes such as incident resolution. All four approaches offer different advantages when it comes to managing the production lifecycle of AI products across multiple environments. The intersection of machine learning, model management, and data infrastructure is an essential element for any organization looking to leverage the power of artificial intelligence.

artificial intelligence, machine learning, modelop, (9 more...)

#artificialintelligence

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Defining the Differences between MLOps, ModelOps, DataOps & AIOps

#artificialintelligence | Mar-5-2023, 20:20:26 GMT

With the rise of artificial intelligence, machine learning and big data, organizations have become increasingly aware of the importance of MLOps (Machine Learning Operations), ModelOps, DataOps, and AIOps. Through this blog post, we will discuss the differences between these various approaches in order to better understand their individual roles within an organization. We then explore how Machine Learning, Model Management and Data Infrastructure intersect in MLOps. Finally, we discuss both the benefits and challenges when it comes to implementing these operations systems. MLOps, ModelOps, DataOps and AIOps are rapidly growing in importance as organizations look to leverage the power of artificial intelligence, machine learning and big data.

dataop and aiop, mlop, modelop, (14 more...)

#artificialintelligence

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

The 2023 MAD (Machine Learning, Artificial Intelligence & Data) Landscape – Matt Turck

#artificialintelligence | Feb-24-2023, 01:00:33 GMT

It has been less than 18 months since we published our last MAD landscape, and it has been full of drama. When we left, the data world was booming in the wake of the gigantic Snowflake IPO, with a whole ecosystem of startups organizing around it. Since then, of course, public markets crashed, a recessionary economy appeared and VC funding dried up. A whole generation of data/AI startups has had to adapt to a new reality. Meanwhile, the last few months saw the unmistakable, exponential acceleration of Generative AI, with arguably the formation of a new mini-bubble.

artificial intelligence, landscape, machine learning, (12 more...)

#artificialintelligence

Industry: Government (0.51)

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.88)
